
refactor: move DM fallback wiring from discord-bridge poll loop to task-bridge #352

Merged
sonichi merged 1 commit into main from refactor/dm-fallback-to-task-bridge on Apr 15, 2026

Conversation

@sonichi (Owner) commented Apr 15, 2026

Summary

Follow-up to #349 after you said "i prefer cleaner". Moves the DM fallback wiring out of the discord-bridge.py polling loop and into task-bridge.ts where the normal result-delivery path already lives.

Net diff: −33 lines (task-bridge.ts +32, discord-bridge.py −64).

What moves

Removed from src/discord-bridge.py:

  • async def poll_dm_fallback() (59 lines) — scanned results/ every 30s, subprocess-called dm-result.py for stale files not in pending_replies
  • its client.loop.create_task(...) registration in on_ready
  • import subprocess (unused after the loop is gone)
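
For reference, the removed loop had roughly this shape. This is a hedged sketch reconstructed from the description above — the 30s interval, 90s grace window, pending_replies set, and the dm-result.py subprocess contract come from this PR; the helper structure and other names are illustrative, not the actual removed code:

```python
# Illustrative reconstruction of the removed poll_dm_fallback() loop.
# Constants and the subprocess contract are from the PR description;
# the stale_unclaimed() helper is a hypothetical factoring for clarity.
import asyncio
import subprocess
import time
from pathlib import Path

POLL_INTERVAL_S = 30   # scan cadence of the old loop
GRACE_WINDOW_S = 90    # age a result file had to reach before fallback fired

def stale_unclaimed(files, pending_replies, now):
    """Pick paths old enough to DM and not awaiting a normal reply."""
    return [p for p, mtime in files
            if now - mtime >= GRACE_WINDOW_S and p.name not in pending_replies]

async def poll_dm_fallback(results_dir="results", pending_replies=frozenset()):
    while True:
        files = [(p, p.stat().st_mtime) for p in Path(results_dir).glob("*.txt")]
        for path in stale_unclaimed(files, pending_replies, time.time()):
            # same subprocess contract the new task-bridge.ts path keeps
            subprocess.run(["python3", "src/dm-result.py", "--file", str(path)])
        await asyncio.sleep(POLL_INTERVAL_S)
```

The worst case falls out of the two constants: a file written just after a scan waits up to 30s for the next tick, then up to 90s more to clear the grace window — the ~2 min ceiling cited below.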

Added to src/task-bridge.ts:

  • import { execSync } from 'node:child_process'
  • Inside startResultWatcher()'s main tick, when !isClientConnected(): iterate the same files list the connected path iterates, execSync python3 src/dm-result.py --file <f> per unprocessed file, then run the same bookkeeping the connected path runs (_sendTaskStatus, _deliveredResults.add, _pendingTasks.delete, /task-done POST, 10s unlink)

Why this is cleaner

  1. Latency. The old loop had a 30s poll interval plus a 90s grace window — worst-case ~2 min between result-write and DM delivery. The new path runs on the result watcher's natural tick, so the DM arrives within one tick.
  2. Right layer. task-bridge.ts is the single owner of the voice-delivery contract. Fallback is part of that contract, so it belongs next to the normal delivery path, not in a sibling process polling the same directory.
  3. No redundant state. task-bridge.ts already has _deliveredResults and isClientConnected() — no need to reinvent pending_replies-awareness / file-age / grace windows.
  4. Subprocess parity preserved. Both approaches shell out to python3 src/dm-result.py --file <f>; dm-result.py's API-based owner + DM-channel resolution from #349 is unchanged.

What this PR does NOT touch

  • src/dm-result.py — all the #349 helpers (_resolve_owner_id, _open_dm_channel, _discord_api) stay as they are
  • discord-bridge.py's poll_proactive() — still handles results/proactive-*.txt via its own direct-Discord-client DM path

Test plan

  • ast.parse clean on discord-bridge.py
  • npx tsc --noEmit --skipLibCheck clean on task-bridge.ts
  • Manual: restart voice-agent to pick up the task-bridge.ts change, disconnect voice, trigger a task, observe DM arrives
  • Note: discord-bridge.py doesn't need a restart — the removed loop was only registered at on_ready, so an already-running bridge wasn't running the #349 fallback until restarted. This PR removes a function that isn't executing in the currently-running bridge anyway.
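
The ast.parse item above is a syntax-only gate; a minimal sketch of what that check amounts to (the path argument is whichever file you are validating):

```python
# Syntax-only validation, equivalent to the test plan's ast.parse step.
# It catches parse errors but does not execute or type-check anything.
import ast

def parses_cleanly(path: str) -> bool:
    try:
        with open(path) as f:
            ast.parse(f.read())
        return True
    except SyntaxError:
        return False
```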

🤖 Generated with Claude Code

…sk-bridge

Follow-up to #349. Owner asked for the cleaner wiring architecture
after #349 merged, because the polling-loop approach has an inherent
30s–120s latency between result-write and DM delivery (30s poll
interval + 90s grace window).

Net effect: −33 lines. One wiring point instead of two.

### What moves

- **Removed** from `src/discord-bridge.py`:
  - `async def poll_dm_fallback()` (59 lines) — the polling loop
    that scanned `results/` every 30s and subprocess-called
    `dm-result.py` for stale files not in `pending_replies`.
  - `client.loop.create_task(poll_dm_fallback())` registration in
    `on_ready`.
  - `import subprocess` (unused after the loop is gone).

- **Added** to `src/task-bridge.ts`:
  - `import { execSync } from 'node:child_process'`.
  - Inside `startResultWatcher()`'s main tick, when
    `!isClientConnected()` — iterate the same `files` the connected
    path iterates, `execSync` `python3 src/dm-result.py --file <f>`
    per unprocessed file, then run the same bookkeeping the
    connected path runs (`_sendTaskStatus`, `_deliveredResults.add`,
    `_pendingTasks.delete`, `/task-done` POST, 10s unlink).

### Why this is better

1. **Latency.** The poll loop had a cumulative 30s+90s worst case.
   task-bridge.ts runs on the same interval as the result watcher
   (2s tick per the file-loop's natural cadence), so the DM arrives
   within a single tick of the result being written. No grace period
   needed — task-bridge already tracks `_deliveredResults` so it
   can't double-send.

2. **Right layer.** task-bridge.ts is the single owner of the
   "voice-agent delivery" contract. Fallback is a property of that
   delivery contract, so it belongs next to the normal delivery path
   — not in a sibling process (discord-bridge.py) polling the same
   directory.

3. **No redundant state.** The polling loop needed `pending_replies`
   awareness, a file-age check, and a grace window. task-bridge.ts
   already has `_deliveredResults` tracking and `isClientConnected()`
   state — nothing to reinvent.

4. **Subprocess parity preserved.** Both the old polling loop and
   this path shell out to `python3 src/dm-result.py --file <f>`.
   `dm-result.py`'s API-based owner + DM-channel resolution from
   #349 is unchanged and carries over directly.

### What this PR does NOT change

- `src/dm-result.py` — untouched. All the `_resolve_owner_id`,
  `_open_dm_channel`, and `_discord_api` helpers from #349 stay as
  they are. The wiring PR just calls it differently.
- `src/discord-bridge.py` `poll_proactive()` — still handles
  `results/proactive-*.txt` with its own direct-Discord-client DM
  path. That's a different contract (explicitly write-proactive-to-DM)
  and doesn't need the voiceConnected check.

### Verification

- `python3 -c "import ast; ast.parse(...)"` clean on discord-bridge.py
- `npx tsc --noEmit --skipLibCheck` clean on task-bridge.ts
- Net diff: −33 lines across 2 files
- dm-result.py smoke test from #349 still valid — this PR doesn't
  touch the script

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sonichi (Owner, Author) commented Apr 15, 2026

Noting a racing PR: @sutando#9708's #351 also implements the cleaner wiring but keeps the poll_dm_fallback polling loop as a coexisting safety net (+28/-1). My #352 removes the loop as redundant (+31/-65, net −33).

See #351 for the comparison comment. Owner picks which interpretation of "cleaner" wins. I'll close this one if they pick #351.

@sonichi sonichi merged commit 0502234 into main Apr 15, 2026
1 check passed
@sonichi sonichi deleted the refactor/dm-fallback-to-task-bridge branch April 15, 2026 21:30
sonichi pushed a commit that referenced this pull request Apr 15, 2026
First half of the D+C DM-fallback redesign the owner approved after the
2026-04-15 DM-flood post-mortem. This PR lands D (separate retention
policy) as a standalone hygiene win. C (event-driven fs.watch wiring
for the DM fallback) will stack on top as a follow-up PR.

### What this does

- Adds `src/archive-stale-results.py` — a small one-shot script that
  walks `results/*.txt`, moves any file whose mtime is older than
  `$RETENTION_HOURS` (default 24) under `results/archive-YYYY-MM-DD/`.
  Files inside existing `archive-*` subdirectories are never touched.
  Honors `DRY_RUN=1` for safe preview.
- Invokes it from `src/startup.sh` right after `mkdir -p tasks results
  data` — i.e. before any long-running service starts iterating
  `results/`. Any accumulated backlog from a prior quiet day gets
  swept out of the service iteration paths at boot time.
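
A minimal sketch of the script's core, assuming only the behavior described in the bullets above — the real `archive-stale-results.py` may differ in structure and details:

```python
# Sketch of the retention sweep described above. RETENTION_HOURS and
# DRY_RUN env-var names come from the commit message; the function
# signature is a hypothetical factoring for testability.
import os
import shutil
import time
from datetime import date
from pathlib import Path

def archive_stale(results_dir="results",
                  retention_hours=float(os.environ.get("RETENTION_HOURS", 24)),
                  dry_run=os.environ.get("DRY_RUN") == "1"):
    results = Path(results_dir)
    cutoff = time.time() - retention_hours * 3600
    dest = results / f"archive-{date.today().isoformat()}"
    moved = []
    # glob("*.txt") stays at the top level, so files inside existing
    # archive-* subdirectories (and results/calls/) are never touched
    for f in results.glob("*.txt"):
        if not f.is_file() or f.stat().st_mtime >= cutoff:
            continue
        moved.append(f.name)
        if not dry_run:
            dest.mkdir(exist_ok=True)
            shutil.move(str(f), str(dest / f.name))
    return moved
```

Run with `DRY_RUN=1` to preview which files would move, or override the window per-run via `RETENTION_HOURS=N`.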

### Why this first, before any DM fallback rewiring

Today's incident proved that `results/` accumulates long-dead files
indefinitely: task-*, question-*, briefing-*, insight-*, friction-*
from voice-originated work that was never spoken because voice wasn't
connected at the time. Any code that scans `results/` for delivery —
my reverted #352 task-bridge wiring, #349's discord-bridge polling
loop, or any future design — is going to re-deliver the entire
backlog on first tick after a cold boot unless the directory is
actually kept small.

D (retention) fixes the underlying hygiene concern independently of
the DM-fallback question. Even if we never wire fallback again, the
directory staying bounded is an unambiguous win. With D in place, C
(the event-driven follow-up) can use `fs.watch` without needing a
complex "pre-existing files at boot" escape hatch — D has already
moved them out of the way.

### Retention window rationale

Defaulted to **24 hours**. This is a judgment call:

- Too short (e.g. 1h) — risks archiving same-day results the user
  might still want the voice agent to speak when they reconnect.
- Too long (e.g. 7d) — bounds the accumulation less aggressively.
- 24h feels right: anything older than one full day is effectively
  dead mail. Overridable per-run via `RETENTION_HOURS=N`.

### Verification

- `bash -n src/startup.sh` clean
- `ast.parse` clean on `archive-stale-results.py`
- Dry run against fresh clean results/: "no stale files to archive"
- Fabricated a 2-day-old test file, DRY_RUN=1 correctly flagged it,
  then real run correctly moved it to `results/archive-2026-04-15/`.
- Does not affect `results/calls/` (subdirectory, skipped by
  `f.is_file()` check).

### What's not in this PR

- **C (event-driven fs.watch wiring)** — stacks on this. Will replace
  the current `readdirSync(RESULT_DIR)` scan in task-bridge.ts with
  an fs.watch listener that only reacts to newly-created files.
  Separate PR, coming next.
- **Full architectural redo of DM fallback delivery** — out of scope;
  the pause-on-DM-fallback rule from the post-mortem still holds.
  This PR is pure hygiene.

### References

- Post-mortem: `notes/post-mortem-dm-flood-2026-04-15.md`
- Feedback memory: `feedback_results_dir_is_graveyard.md`
- Owner's D+C decision: Discord 23:20 local

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>